 dev and test



Development of Hybrid ASR Systems for Low Resource Medical Domain Conversational Telephone Speech

Lüscher, Christoph, Zeineldeen, Mohammad, Yang, Zijian, Raissi, Tina, Vieting, Peter, Le-Duc, Khai, Wang, Weiyue, Schlüter, Ralf, Ney, Hermann

arXiv.org Artificial Intelligence

Language barriers present a great challenge in our increasingly connected and global world. Especially within the medical domain, e.g. in a hospital or emergency room, communication difficulties and delays may lead to malpractice and non-optimal patient care. In the HYKIST project, we consider patient-physician communication, more specifically between a German-speaking physician and an Arabic- or Vietnamese-speaking patient. Currently, a doctor can call the Triaphon service to get assistance from an interpreter in order to help facilitate communication. The HYKIST goal is to support the usually non-professional bilingual interpreter with an automatic speech translation system to improve patient care and help overcome language barriers. In this work, we present our ASR system development efforts for this conversational telephone speech translation task in the medical domain for two language pairs, data collection, various acoustic model architectures and dialect-induced difficulties.


End-to-End Speech Translation of Arabic to English Broadcast News

Bougares, Fethi, Jouili, Salim

arXiv.org Artificial Intelligence

Speech translation (ST) is the task of directly translating acoustic speech signals in a source language into text in a foreign language. The ST task has long been addressed using a pipeline approach with two modules: first an Automatic Speech Recognition (ASR) system in the source language, followed by a text-to-text Machine Translation (MT) system. In the past few years, we have seen a paradigm shift towards end-to-end approaches using sequence-to-sequence deep neural network models. This paper presents our efforts towards the development of the first Broadcast News end-to-end Arabic to English speech translation system. Starting from independent ASR and MT LDC releases, we were able to identify about 92 hours of Arabic audio recordings for which the manual transcription was also translated into English at the segment level. These data were used to train and compare pipeline and end-to-end speech translation systems under multiple scenarios including transfer learning and data augmentation techniques.


Strategic Management of Machine Learning Projects

#artificialintelligence

You can sometimes break an end-to-end model into two stages and insert a hand-designed component between them that extracts some features or does some processing, making the whole system much better. For instance, once a person has been detected in an image, you might find that a hand-designed component that crops to the person's face before the recognition step gives a better face recognition system than one that is completely end-to-end.
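The two-stage idea described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `make_pipeline`, `toy_detect_face`, and `toy_recognize` are hypothetical names, the image is a toy list of pixel rows, and the detector and recognizer are trivial stand-ins rather than real models.

```python
from typing import Callable, List, Optional, Tuple

Image = List[List[int]]          # toy grayscale image: rows of pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def crop(image: Image, box: Box) -> Image:
    """Hand-designed middle component: keep only the pixels inside the box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def make_pipeline(
    detect_face: Callable[[Image], Optional[Box]],
    recognize: Callable[[Image], str],
) -> Callable[[Image], Optional[str]]:
    """Compose detector -> crop -> recognizer into one callable."""
    def pipeline(image: Image) -> Optional[str]:
        box = detect_face(image)
        if box is None:          # no face found, so skip recognition
            return None
        return recognize(crop(image, box))
    return pipeline


# Hypothetical stand-ins so the sketch runs without any trained model.
def toy_detect_face(image: Image) -> Optional[Box]:
    """Treat the bounding box of all non-zero pixels as the 'face'."""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for row in image for c, value in enumerate(row) if value]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows) + 1, max(cols) + 1)


def toy_recognize(face: Image) -> str:
    """Pretend identity label derived from the cropped pixels."""
    return f"face-{sum(map(sum, face))}"


recognize_person = make_pipeline(toy_detect_face, toy_recognize)
```

In a real system, the two callables passed to `make_pipeline` would be trained models, while `crop` would remain the hand-designed step between them; an end-to-end alternative would replace all three with a single model that maps the full image to an identity.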


Machine Learning Strategies: Part 2

#artificialintelligence

Building a commercial machine learning application is a challenging task. Therefore, following promising directions would save you a lot of time. In the previous article, I mentioned scales that drive machine learning progress. Building the proper model needs the right dataset. In this article, I will discuss dataset selection and how to make your dataset for machine learning models.

6 Key Concepts in Andrew Ng's "Machine Learning Yearning"

#artificialintelligence

Machine Learning Yearning is about structuring the development of machine learning projects. The book contains practical insights that are difficult to find elsewhere, in a format that is easy to share with teammates and collaborators. Most technical AI courses will explain to you how the different ML algorithms work under the hood, but this book teaches you how to actually use them. If you aspire to be a technical leader in AI, this book will help you on your way. Historically, the only way to learn how to make strategic decisions about AI projects was to participate in a graduate program or to gain experience working at a company.


Machine Learning Yearning, Memeified.

#artificialintelligence

Welcome to the first edition of Memeified AI! Memeified AI summarizes important works in the AI field as memes and tweets, so that busy people can get a sense of what's inside. The titles shown in bold are straight from Andrew's book. The memes and tweetish summaries are mine, so any errors are clearly me just thinking unclearly. For the curious, I also provide a 20-minute talk with Q&A if your group is into that sort of thing. Before we get started, I thought I'd share a personal story about how these memeified posts came about, so you can understand the source of my madness.


6 concepts of Andrew NG's book: "Machine Learning Yearning"

#artificialintelligence

Andrew Ng is a computer scientist, executive, investor, entrepreneur, and one of the leading experts in Artificial Intelligence. He is the former Vice President and Chief Scientist of Baidu, an adjunct professor at Stanford University, the creator of one of the most popular online courses for machine learning, and the co-founder of Coursera.com. At Baidu, he was significantly involved in expanding their AI team to several thousand people. The book starts with a little story. Imagine you want to build the leading cat detector system as a company.